EPITOPES OF THE ENV PROTEIN OF THE HEPATITIS C VIRUS

This invention relates to epitopes of the env
protein of the hepatitis C virus.

More particularly, this invention relates to
peptides comprising epitopes of the hepatitis C virus
(HCV) localized in the envelope surface viral protein
(env), which are capable of reacting with antisera
and/or with monoclonal antibodies; it also relates
to the amino acid sequences of said epitopes, as well
as to the nucleotide sequences coding for them.

The documents cited with a numeral reference are
listed at the end of this disclosure.

The virus HCV is believed to be responsible for
the hepatitis classified as non-A/non-B (PT-NANB) (1).
The existence of an etiological agent for NANB
hepatitis has also been proved by Alter et al. (2). The
virus has been identified as an RNA virus of positive
polarity, and the genome, in the form of cDNA, has been
wholly cloned and sequenced. An analysis of the
sequence showed that it consists of about 10,000
ribonucleotides and forms a single reading frame that
potentially codes for a single amino acid chain. This
same organization is also present in other viral
families such as those of Flavivirus and of Pestivirus;
however, other structural characteristics make it
uncertain to set forth a precise taxonomic position
of HCV.

The cloning of a first portion of the genome has
been disclosed by Choo, Q.L. et al. (3), and the
sequence has been published in the European Patent EP
88310922.5. The regions identified correspond to the
so-called nonstructural regions which, in a way similar
to that of flavivirus, have been called NS1, NS2, NS3,
NS4 and NS5.

More recently, structural regions, coding for
capsid and for surface proteins, have been cloned and
sequenced. Such sequences have been published by
Okamoto, H. et al. (4) and in the European Patent
application EP 90302866.0.

In order to identify immunological markers of HCV
infection, large amounts of viral antigens are needed.
However, unlike other hepatotropic viruses, such as
HBV and HDV, the concentration of HCV in the liver and
in the blood is very low and, unlike the hepatitis A
virus (HAV), HCV cannot be grown in vitro. Therefore a
good natural source of viral antigens is not available.

Accordingly, the preparation of immunological
tests requires the availability of synthetic peptides
capable of mimicking the immunological activity of
viral antigens. To that aim, the identification of
specific protein portions, denominated epitopes,
capable of reacting with antibodies is necessary due to
the short length of synthetic peptides. Moreover, it is
well known that tests which employ just the epitope of
the protein are more sensitive and more accurate.
Up to the present time it has been impossible to
identify portions of the HCV env protein with antigenic
activity, capable of reacting with antibodies and,
therefore, the env protein or portions thereof have
never been employed for immunological tests.

It is well known that RNA viruses are
characterized by a high frequency of spontaneous
mutation. In the case of HCV, variable and
hypervariable domains have been identified in the
sequences corresponding to the surface proteins (5 and
EP 0419182 A1), possibly related to viral mechanisms
of escape from the immune response. Moreover, NANB
hepatitis becomes a chronic disease in about 50% of
patients. It is therefore very useful to identify
epitopes of surface proteins both for diagnostic and
for prognostic purposes.

The Authors of this invention have identified
variable regions with a high antigenic activity in the
amino acid sequence of the env protein, and they have
found that such regions correspond to epitopes of said
protein. The Authors have also identified some variants
of such regions by means of amplification of nucleic
acids from serum samples; among such regions, one is
coded by an HCV variant not disclosed before.

The endemic distribution of the different viral
variants of HCV makes it necessary to prepare
assays able to detect epitopes of the different
variants.

The Authors have synthesized such epitopes in
vitro for immunological assays on serum samples.

The availability of an anti-env marker with
serological characteristics such as those of the object
of this invention can lead to more specific tests,
which can be employed in particular for anti-HCV
screening of blood samples. Indeed, an analysis carried
out by Contreras et al. (6) employing only the test
based on the c100 protein gave rise to a remarkable
number of false positive results, with no precise
identification of the infected samples.

Finally, as tests which employ amplification
procedures such as the PCR (polymerase chain reaction)
are not suitable for mass screening, it is useful to
correlate the positive results obtained with the assay
developed by the Authors with the results obtained
with the PCR.

Accordingly, a specific object of this invention
is an amino acid sequence comprising an epitope of the
env protein of the HCV virus, preferably in the portion
from amino acid 209 to amino acid 259, according to the
numbering given in (3).

According to some preferred embodiments of this
invention, said sequences are included in the following
group of sequences: SEQ ID N1, SEQ ID N2 and SEQ ID N3,
preferably from amino acid 13 to amino acid 46 of
SEQ ID N1, more preferably from amino acid 21 to
amino acid 30 of SEQ ID N1; alternatively from
amino acid 13 to amino acid 46 of SEQ ID N2,
preferably from amino acid 21 to amino acid 30
of SEQ ID N2; alternatively from amino acid 13 to
amino acid 47 of SEQ ID N3, preferably from
amino acid 21 to amino acid 31 of SEQ ID N3.
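
The residue ranges above are 1-based, inclusive positions within each
listed sequence. As a minimal sketch, the following Python translates the
SEQ ID N1 coding sequence and slices out the ranges named in this
disclosure. The nucleotide string is a transcription of the SEQ ID N1
listing given in the list of sequence characteristics; the helper names
(`translate`, `residues`) are illustrative and are not part of the
disclosure.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# SEQ ID N1 (153 bases), transcribed from the sequence listing. Bases that
# are hard to read in the listing were resolved against the printed
# amino acid line (an assumption of this sketch).
SEQ_ID_N1 = (
    "AACTCGAGCATTGTGTACGAGGCTGCCGAC"
    "GCCATCCTGCACACTCCGGGGTGCGTCCCT"
    "TGCGTTCGCGAGGGTAACGCCTCGAGGTGT"
    "TGGGTGGCGATCACCCCCACGGTGGCCACC"
    "AGGGATGGCAAACTCCCCACAGCGCACGTT"
    "CGA"
)


def translate(dna: str) -> str:
    """Translate a coding sequence, codon by codon, into one-letter residues."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def residues(protein: str, first: int, last: int) -> str:
    """Return residues first..last using 1-based, inclusive numbering."""
    return protein[first - 1:last]


env1 = translate(SEQ_ID_N1)
print(len(SEQ_ID_N1), len(env1))   # bases, residues
print(residues(env1, 13, 46))      # preferred portion of SEQ ID N1
print(residues(env1, 21, 30))      # more preferred, cysteine-delimited portion
```

The 21 to 30 slice begins and ends with a cysteine, which is what makes
the cyclization by reaction of two cysteine residues possible.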

A further object of the invention is a peptide
according to any of said amino acid sequences,
preferably of synthetic origin, more preferably
cyclized by means of reaction of two residues of
cysteine.

A further object of this invention is a
nucleotide sequence coding for an epitope of the env
protein, preferably comprised in one of the sequences
of the following group: SEQ ID N1, SEQ ID N2 and SEQ
ID N3; preferably comprising at least one fragment of
10 nucleotides of SEQ ID N3, more preferably the entire
sequence of SEQ ID N3.

This invention will now be disclosed in some
working examples of the same, with reference to the
following figures, wherein:
- Figures 1A, 1B and 1C represent the hydrophilic
profiles respectively of the env 1, env 2 and env 3
variants.

EXAMPLE 1 Identification of 3 variants in the region of
the env protein and synthesis of the corresponding
peptides

An investigation carried out by means of nucleic
acid amplification procedures from serum samples (PCR,
7) allowed the identification of 3 main variants of the
env surface protein, said variants being called
respectively env 1, env 2 and env 3, and comprising
the sequences disclosed respectively as SEQ ID N1,
SEQ ID N2 and SEQ ID N3.

The variants env 1 and env 2 are comprised in
viral variants respectively known by those skilled in
the art as HCV A1 (American isolate) and HCV J1
(Japanese isolate). The variant env 3 is coded by a
viral variant, denominated HCV 3, which is not included
in any HCV isolate disclosed up to the present
invention. Such variant differs mainly by the insertion
of a histidine residue into a region delimited by 2
cysteines, which modifies the hydrophilic profile of
the gene product (Figures 1A, 1B and 1C). Such
modification is of particular relevance for the analogy
with the transmembrane region of the HIV1 surface
protein (8, 9).
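
A hydrophilic profile of the kind shown in Figures 1A to 1C can be
sketched as a sliding-window average over per-residue hydrophilicity
values. The disclosure does not name the scale used; the example below
assumes the Hopp-Woods scale together with the 6-residue averaging length
stated in the figure captions. The two peptide strings are the
cysteine-delimited segments of env 2 (residues 21 to 30 of SEQ ID N2) and
env 3 (residues 21 to 31 of SEQ ID N3) in one-letter code, and the
function name is illustrative.

```python
# Hopp-Woods hydrophilicity values. This scale is an assumption of the
# sketch: the disclosure does not state which scale produced Figure 1.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilic_profile(peptide: str, window: int = 6) -> list[float]:
    """Mean hydrophilicity of every run of `window` consecutive residues."""
    values = [HOPP_WOODS[aa] for aa in peptide]
    return [
        round(sum(values[i:i + window]) / window, 3)
        for i in range(len(values) - window + 1)
    ]


ENV2_LOOP = "CVRENNLSRC"    # residues 21-30 of SEQ ID N2
ENV3_LOOP = "CVREGDNHSRC"   # residues 21-31 of SEQ ID N3 (extra His)

for name, loop in (("env 2", ENV2_LOOP), ("env 3", ENV3_LOOP)):
    print(name, len(loop), hydrophilic_profile(loop))
```

Because env 3 carries one more residue between the two cysteines, its
profile has one more window than that of env 2, and the windows that span
the inserted histidine take different values.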
Oligopeptides comprising respectively the sequence
of the env protein from amino acid 13 to amino acid 32
of SEQ ID N1 (the env 1 variant); the sequence of the
env protein from amino acid 13 to amino acid 32 of
SEQ ID N2 (the env 2 variant); the sequence of the env
protein from amino acid 13 to amino acid 33 of SEQ ID
N3 (the env 3 variant) are synthesized according to
Merrifield's method (10), employing as the solid phase
a polyamide resin "Pepsyn K polyamide Kieselguhr"
(Milligen, Novato, California), which had been
previously functionalized with ethylenediamine and with
4-(alpha-Fmoc-amino-2',4'-dimethoxybenzyl)phenoxyacetic
acid. The amino acids employed for the synthesis are
protected on the side chains by tert-butyl groups and
on the alpha-amino position with the Fmoc group
(9-fluorenylmethyloxycarbonyl group). The guanidinium
group of arginine and the imidazole group of histidine
are protected respectively with the
2,2,5,7,8-pentamethylchroman-6-sulfonyl and trityl
groups. The carboxy group of the amino acids employed
is activated by the formation of an ester-type bond
with the pentafluorophenyl group. The synthesis is
performed with the Milligen 9050 synthesizer (Novato,
California) employing the continuous flow method. The
removal of protection and the separation of the
peptides from the resin are carried out by treatment
with trifluoroacetic acid. The peptide sequence is
checked with an automatic microsequencer (Porton
Instruments).

EXAMPLE 2 Cyclization of the peptides

Oligopeptides comprising respectively the sequence
of the env protein from amino acid 21 to amino acid 30
of SEQ ID N1 (the env 1 variant); the sequence of the
env protein from amino acid 21 to amino acid 30 of
SEQ ID N2 (the env 2 variant); the sequence of the env
protein from amino acid 21 to amino acid 31 of SEQ ID
N3 (the env 3 variant) are synthesized according to
Example 1.

The cyclization of a fraction of the peptides is
carried out in the following way: the peptide is
dissolved in water to a concentration of 0.1 mg/ml. The
pH value is adjusted to 7 with 1M NH4OH. Potassium
ferricyanide is then added slowly to the solution (400
mg K3Fe(CN)6 in 200 ml of water) till persistence of
the yellow colour. The disappearance of the free SH
groups is verified employing the method of Ellman (11).

Alternatively the peptide is dissolved at 0.2
mg/ml in distilled/deionized water (Milli-Q) and the pH
is adjusted to pH 8 using a solution of 3M NH4Cl. The
solution is allowed to stir for four days and the loss
of the free sulphydryl groups is monitored using the
Ellman titration. Briefly, 24 mg of 5,5'-dithio-bis
(2-nitrobenzoic acid) is dissolved in 5 ml of phosphate
buffer pH 7. 20 µl of this solution is mixed with 1 ml
of the peptide solution and the absorbance is read at
412 nm. After four days 96% of the free sulphydryl
groups have disappeared.
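
The monitoring step above reduces to simple arithmetic on absorbances read
at 412 nm. The following is a minimal sketch, assuming Beer-Lambert
behaviour, a 1 cm path length and a molar extinction coefficient of
14,150 M^-1 cm^-1 for the coloured product at 412 nm (a commonly used
literature value that is not stated in this disclosure). The absorbance
readings in the usage lines are hypothetical and only illustrate the
calculation; the function names are likewise illustrative.

```python
EPSILON_412 = 14_150.0   # M^-1 cm^-1, assumed literature value
PATH_CM = 1.0            # assumed cuvette path length


def free_thiol_molar(a412: float, sample_ml: float = 1.0,
                     reagent_ml: float = 0.020) -> float:
    """Free SH concentration (mol/l) in the undiluted peptide solution.

    The defaults mirror the volumes in the text: 20 microlitres of
    reagent solution mixed with 1 ml of peptide solution.
    """
    in_cuvette = a412 / (EPSILON_412 * PATH_CM)
    dilution = (sample_ml + reagent_ml) / sample_ml
    return in_cuvette * dilution


def percent_thiol_lost(a412_start: float, a412_now: float) -> float:
    """Percentage of free SH groups that have disappeared since the start."""
    return 100.0 * (1.0 - a412_now / a412_start)


# Hypothetical blank-corrected readings, chosen only for illustration.
start, day4 = 0.500, 0.020
print(free_thiol_molar(start))
print(percent_thiol_lost(start, day4))
```

Only the ratio of the two readings enters the percentage, so the assumed
extinction coefficient affects the absolute concentration but not the
fraction of sulphydryl groups lost.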

EXAMPLE 3 Immunological assay

In order to determine the immunogenicity of the
linear and cyclized peptides described in EXAMPLE 2, an
ELISA assay is carried out.
The cyclic and linear peptides are dissolved in
50 mM carbonate buffer, pH 9.6, at a concentration of 5
µg/ml. 200 µl/well is dispensed into a microtitration
plate and incubated for 1 hr at 37°C. The overcoating
of the wells is performed by adding to the emptied
wells 300 µl of a solution containing 50 mM
Tris-HCl pH 7.4 and 0.2% bovine serum albumin (BSA,
Sigma, Fraction V). The plates are incubated for 2 hrs
at room temperature.

Finally 300 µl/well of a solution containing 10%
sucrose, 4% polyvinylpyrrolidone and 9% NaCl is added
and left for 1 hr at room temperature.

The ELISA assay is performed by dispensing 200
µl/well of sera, previously diluted using an HCV
negative serum as sample diluent. The samples are
incubated for 1 hr at 37°C. The plates are then washed
five times with a solution containing 0.05% Tween-20
and 0.1% BSA in 50 mM phosphate buffer pH 7.4 (washing
buffer) and incubated for 1 hr at 37°C with 200 µl of a
solution containing goat IgG anti-human IgGs,
conjugated with horseradish peroxidase (HRP).

After five washings with washing buffer the
plates are incubated for 30 min with a
chromogen-substrate solution (tetramethylbenzidine and
3% hydrogen peroxide). The reaction is stopped with 1N
sulphuric acid and the absorbance is read at 450 nm.

The serum utilized (21) belongs to the panel BBI
mixed HCV (Boston Biomedica Inc.). The control HCV
negative serum constantly gives values lower than 0.04.

The results are shown in the following Table 1.
Table 1

ELISA assay with serum 21 BBI

             env 1              env 2              env 3
serum    cyclic   linear    cyclic   linear    cyclic   linear
dil.     OD 450   OD 450    OD 450   OD 450    OD 450   OD 450

1:20     2.150    0.141     1.648    0.114     1.777    0.132
1:40     1.028    0.093     0.841    0.095     0.980    0.089
1:80     0.615    0.078     0.512    0.071     0.546    0.076
1:160    0.243    0.074     0.221    0.064     0.150    0.070
1:320    0.098    0.061     0.093    0.051     0.090    0.054
1:640    0.061    0.060     0.048              0.054    0.056


The results show that env 1, env 2 and env 3
peptides are able to react with anti-HCV sera. The
reactivity is greatly increased when such peptides are
made cyclic and therefore have a conformational
structure similar to the corresponding region of the
whole env protein. The reactivity decreases
proportionally with serum dilution, thus indicating
that the reaction is specific.
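
The two observations above can be checked directly against Table 1. The
sketch below copies the OD 450 readings from the table (the env 2 linear
reading at 1:640 does not appear in the table and is stored as `None`)
and computes the cyclic-to-linear signal ratio and the change in signal
between successive two-fold dilutions. The variable and function names
are illustrative.

```python
DILUTIONS = [20, 40, 80, 160, 320, 640]  # reciprocal serum dilutions

# OD 450 readings copied from Table 1; None marks the reading that is
# absent from the table.
OD450 = {
    ("env 1", "cyclic"): [2.150, 1.028, 0.615, 0.243, 0.098, 0.061],
    ("env 1", "linear"): [0.141, 0.093, 0.078, 0.074, 0.061, 0.060],
    ("env 2", "cyclic"): [1.648, 0.841, 0.512, 0.221, 0.093, 0.048],
    ("env 2", "linear"): [0.114, 0.095, 0.071, 0.064, 0.051, None],
    ("env 3", "cyclic"): [1.777, 0.980, 0.546, 0.150, 0.090, 0.054],
    ("env 3", "linear"): [0.132, 0.089, 0.076, 0.070, 0.054, 0.056],
}


def cyclic_to_linear(variant, dilution):
    """Ratio of cyclic to linear signal, or None if a reading is missing."""
    i = DILUTIONS.index(dilution)
    cyc = OD450[(variant, "cyclic")][i]
    lin = OD450[(variant, "linear")][i]
    return None if cyc is None or lin is None else cyc / lin


def fold_changes(readings):
    """Ratio of each reading to the previous, two-fold less diluted one."""
    return [
        b / a
        for a, b in zip(readings, readings[1:])
        if a is not None and b is not None
    ]


for variant in ("env 1", "env 2", "env 3"):
    ratio = cyclic_to_linear(variant, 20)
    steps = fold_changes(OD450[(variant, "cyclic")])
    print(variant, round(ratio, 1), [round(s, 2) for s in steps])
```

Every fold change for the cyclic peptides is below 1, that is, the signal
falls at each dilution step, and at 1:20 the cyclic peptides give a signal
more than ten times that of the linear ones.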

This invention has been disclosed with specific
reference to some preferred embodiments of the same,
but it is to be understood that modifications and/or
changes can be introduced by those who are skilled in
the art without departing from the spirit and scope of
the invention for which a priority right is claimed.
LIST OF THE SEQUENCE CHARACTERISTICS

SEQ ID N1

SEQUENCE TYPE: Nucleotide with corresponding peptide
LENGTH OF THE SEQUENCE: 153 base pairs
CONFORMATION: single helix
TOPOLOGY: linear
MOLECULAR TYPE: cDNA from genomic RNA
HYPOTHETIC SEQUENCE: no
ANTI-SENSE: no
ORIGINAL SOURCE: HCV virus variant A1
EXPERIMENTAL SOURCE: gene library from viral isolate
CHARACTERISTICS: coding for a portion of env protein
variant env 1
IDENTIFICATION METHOD: experimental
PROPERTY: coding sequence

AAC TCG AGC ATT GTG TAC GAG GCT GCC GAC   30
Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp
1               5                   10

GCC ATC CTG CAC ACT CCG GGG TGC GTC CCT   60
Ala Ile Leu His Thr Pro Gly Cys Val Pro
11              15                  20

TGC GTT CGC GAG GGT AAC GCC TCG AGG TGT   90
Cys Val Arg Glu Gly Asn Ala Ser Arg Cys
21              25                  30

TGG GTG GCG ATC ACC CCC ACG GTG GCC ACC   120
Trp Val Ala Ile Thr Pro Thr Val Ala Thr
31              35                  40

AGG GAT GGC AAA CTC CCC ACA GCG CAC GTT   150
Arg Asp Gly Lys Leu Pro Thr Ala His Val
41              45                  50

CGA
Arg
51

SEQ ID N2

SEQUENCE TYPE: Nucleotide with corresponding protein
LENGTH OF THE SEQUENCE: 153 base pairs
CONFORMATION: single helix
TOPOLOGY: linear
MOLECULAR TYPE: cDNA from genomic RNA
HYPOTHETIC SEQUENCE: no
ANTI-SENSE: no
ORIGINAL SOURCE: HCV virus variant J1
EXPERIMENTAL SOURCE: gene library from viral isolate
CHARACTERISTICS: coding for a portion of env protein
env 2 variant
IDENTIFICATION METHOD: experimental
PROPERTY: coding sequence

AAC TCA AGC ATC GTG TAT GAG GCA GCA GAC   30
Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp
1               5                   10

TTG ATC ATG CAC ACC CCC GGG TGC GTG CCC   60
Leu Ile Met His Thr Pro Gly Cys Val Pro
11              15                  20

TGC GTT CGG GAG AAC AAC CTC TCC CGC TGC   90
Cys Val Arg Glu Asn Asn Leu Ser Arg Cys
21              25                  30

TGG GTA GCG CTC ACT CCC ACG CTT GCG GCC   120
Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
31              35                  40

AGG AAT GTC AGC GTC CCC ACA GCA ACA ATA   150
Arg Asn Val Ser Val Pro Thr Ala Thr Ile
41              45                  50

CGA
Arg
51

SEQ ID N3

SEQUENCE TYPE: Nucleotide with corresponding protein
LENGTH OF THE SEQUENCE: 156 base pairs
CONFORMATION: single helix
TOPOLOGY: linear
MOLECULAR TYPE: cDNA from genomic RNA
HYPOTHETIC SEQUENCE: no
ANTI-SENSE: no
ORIGINAL SOURCE: HCV virus variant 3
EXPERIMENTAL SOURCE: gene library from viral isolate
CHARACTERISTICS: coding for a portion of the env
protein env 3 variant
IDENTIFICATION METHOD: experimental
PROPERTY: coding sequence

AAC TCA AGT ATT GTG TAT GAG GCA GCG GAC   30
Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp
1               5                   10

CTG ATC ATG CAC ACQ CCC GGG TGC GTG CCC   60
Leu Ile Met His Thr Pro Gly Cys Val Pro
11              15                  20

TGC GTT CGG GAA GGh GAC AAC CAC TCC CGC   90
Cys Val Arg Glu Gly Asp Asn His Ser Arg
21              25                  30

TGC TGG GTA GCG CTC ACT CCC ACT CTC GCG   120
Cys Trp Val Ala Leu Thr Pro Thr Leu Ala
31              35                  40

GCC AGG AAT AGC AGC GTC CCC ACC ACG ACA   150
Ala Arg Asn Ser Ser Val Pro Thr Thr Thr
41              45                  50

ATA CGA
Ile Arg
51

CLAIMS

1. An amino acid sequence characterized in that
it comprises an epitope of the env protein of the HCV
virus.

2. An amino acid sequence according to claim 1,
characterized in that it is comprised within the
portion from amino acid 209 to amino acid 259
according to the numbering of Choo, Q.-L. et al.,
Science (1988), 244:359-362 (3).

3. An amino acid sequence according to claim 2,
characterized in that it is comprised in SEQ ID N1.

4. An amino acid sequence according to claim 3,
characterized in that it comprises the portion from
amino acid 13 to amino acid 46 of SEQ ID N1.

5. An amino acid sequence according to claim 3,
characterized in that it comprises the portion from
amino acid 21 to amino acid 30 of SEQ ID N1.

6. An amino acid sequence according to claim 2,
characterized in that it is comprised in SEQ ID N2.

7. An amino acid sequence according to claim 6,
characterized in that it comprises the portion from
amino acid 13 to amino acid 46 of SEQ ID N2.

8. An amino acid sequence according to claim 6,
characterized in that it comprises the portion from
amino acid 21 to amino acid 30 of SEQ ID N2.



WO 93/02103 



PCT/IT92/00081




9. An amino acid sequence according to claim 2,
characterized in that it is comprised within SEQ ID
N3.

10. An amino acid sequence according to claim 9,
characterized in that it comprises the portion from
amino acid 13 to amino acid 47 of SEQ ID N3.

11. An amino acid sequence according to claim 9,
characterized in that it comprises the portion from
amino acid 21 to amino acid 31 of SEQ ID N3.

12. Peptides characterized in that they have the
amino acid sequence according to any one of the
preceding claims.

13. Peptides according to claim 12, characterized
in that they are synthetic peptides.

14. Peptides according to claim 12 or 13,
characterized in that they have a conformational
structure able to increase the immunogenicity thereof.

15. Peptides according to claim 14, characterized
in that said conformational structure is achieved by
reacting two residues of cysteine and by cyclizing the
peptide.

16. A nucleotide sequence coding for an epitope
of the env protein.

17. A nucleotide sequence according to claim 16,
characterized in that it is comprised in the sequence
SEQ ID N1.

18. A nucleotide sequence according to claim 16,
characterized in that it is comprised in the sequence
SEQ ID N2.

19. A nucleotide sequence according to claim 16,
characterized in that it is comprised in the sequence
SEQ ID N3.

20. A nucleotide sequence according to claim 19,
characterized in that it comprises at least one
fragment of 10 nucleotides of SEQ ID N3.

21. A nucleotide sequence according to claim 16,
characterized in that it comprises the sequence SEQ ID
N3.

FIG. 1

FIG. 1A: Hydrophilic profile of the protein sequence
HCVENV1, calculated on the basis of an average length
of 6 amino acids.

FIG. 1B: Hydrophilic profile of the protein sequence
HCVENV2, calculated on the basis of an average length
of 6 amino acids.

FIG. 1C: Hydrophilic profile of the protein sequence
HCVENV3, calculated on the basis of an average length
of 6 amino acids.